Salbutamol inhalers are used to relieve the symptoms of respiratory conditions, such as asthma and chronic obstructive pulmonary disease (NHS, 2025). Patients may use their salbutamol inhaler when they experience symptoms, such as breathlessness, tightness in their chest, coughing and wheezing (NHS, 2025). These symptoms are usually exacerbated by the cold weather (Asthma and Lung UK, 2023).
Therefore, this question aims to determine whether there is a relationship between the number of prescriptions for salbutamol inhalers (including branded salbutamol inhalers) and the proportion of households without central heating in NHS Scotland healthboards. We will look at the months which are the coldest in Scotland, the winter months, which for 2021/2022 were December, January and February.
pacman::p_load(tidyverse, janitor, here, readxl, plotly, sf, gt)
#function to clean datasets
clean_prescriptions <- function(df){
df %>%
drop_na() %>%
clean_names()}
data_dec2021 <- read_csv("https://www.opendata.nhs.scot/dataset/84393984-14e9-4b0d-a797-b288db64d088/resource/ad9e7b46-47fb-4d42-baad-f8e98e8f5936/download/pitc202112.csv") %>%
clean_prescriptions()
data_jan2022 <- read_csv("https://www.opendata.nhs.scot/dataset/84393984-14e9-4b0d-a797-b288db64d088/resource/53a53d61-3b3b-4a12-888b-a788ce13db9c/download/pitc202201.csv") %>%
clean_prescriptions()
data_feb2022 <- read_csv("https://www.opendata.nhs.scot/dataset/84393984-14e9-4b0d-a797-b288db64d088/resource/bd7aa5c9-d708-4d0b-9b28-a9d822c84e34/download/pitc202202.csv") %>%
clean_prescriptions()
#function to filter for salbutamol inhalers and summarise inhaler prescriptions
salbut_relievers <- c("SALBUTAMOL", "VENTOLIN", "AIROMIR", "ASMALAL", "EASYHALER", "PULVINAL", "SALAMOL", "EASI-BREATHE", "SALBULIN") #names of salbutamol and brand equivalents
salbut_presc <- function(df){
df %>%
filter(str_detect(bnf_item_description,
paste(salbut_relievers, collapse = "|"))) %>%
filter(!(str_detect(bnf_item_description,
"FORMOTEROL|BECLOMET|BUDESONIDE|FOBUMIX"))) %>%
select(c(hbt, bnf_item_description, paid_quantity)) %>% #prepare dataset
group_by(hbt) %>%
summarise(prescribed_salbutamol = sum(paid_quantity)) %>% #add new column for sum
subset(hbt != "SB0806") #removing row with code SB0806
}
#filtering datasets for each winter month in 2021/2022
salbut_dec2021 <- salbut_presc(data_dec2021) %>%
rename("salbutamol_dec21" = prescribed_salbutamol)
salbut_jan2022 <- salbut_presc(data_jan2022) %>%
rename("salbutamol_jan22" = prescribed_salbutamol)
salbut_feb2022 <- salbut_presc(data_feb2022) %>%
rename("salbutamol_feb22" = prescribed_salbutamol)
In this code chunk, I created a function that uses regeular expressions to filter for salbutamol and its alternative brand names prescribed in the NHS (NHS, 2025). I then filtered it to remove any inhaler types that have the same brand name but a different chemical compound e.g. EASYHALER_BECLOMET 200MCG (200 D) (not salbutamol). Then I found the total of all salbutamol inhalers prescribed per healthboard.
I removed data related to the code for Scottish Ambulance Service (SB0806), as my question does not relate to this (Public Health Scotland, 2021).
#join datasets for 2021/2022 together
join_prescription <- function(df1, df2){
df1 %>%
full_join(df2, join_by("hbt" == "hbt"))
}
#join winter 2021/2022 data together
joined_salbut_2122 <- join_prescription(salbut_dec2021, salbut_jan2022)
joined_salbut_2122 <- join_prescription(joined_salbut_2122, salbut_feb2022)
In the code chunk above, I join all the totals of salbutamol prescriptions for each month together.
HB_names <- read_csv("https://www.opendata.nhs.scot/dataset/9f942fdb-e59e-44f5-b534-d6e17229cc7b/resource/652ff726-e676-4a20-abda-435b98dd7bdc/download/hb14_hb19.csv") %>%
clean_names()
joined_salbut_2122 %<>% full_join(HB_names, join_by("hbt" == "hb")) %>%
select(c(hbt, hb_name, starts_with("salbutamol"))) %>%
filter(!is.na(salbutamol_dec21)) #rows contain old area codes (now archived)
Next, I joined the prescription data to names of the healthboards.
Area codes from row 15 to 18 are archived healthboard codes, but which are now archived due to changes (Public Health Scotland, 2024). Therefore we can remove them.
#join population data
population_data <- read_csv("https://www.opendata.nhs.scot/dataset/7f010430-6ce1-4813-b25c-f7f335bdc4dc/resource/27a72cc8-d6d8-430c-8b4f-3109a9ceadb1/download/hb2019_pop_est_03102025.csv")
population_data %<>%
filter(Year == "2022" & Sex == "All") %>%
select(c(HB, AllAges)) %>%
filter(HB != "S92000003") #remove country code for Scotland
salbut_population_2122 <- joined_salbut_2122 %>%
full_join(population_data, join_by("hbt" == "HB")) %>%
rename("population" = AllAges)
S92000003 is the country code for Scotland, so this row tells us the population of Scotland in this year. So, we can remove it. Then I joined the population data with the prescription data.
#find prescription per 100k of population
salbut_population_2122 %<>%
rowwise() %>%
mutate(average_salbut = mean(c_across(starts_with("Salbutamol")))) %>%
mutate(average_salbut_per100k = (average_salbut*100000)/population)
#plot table
salbut_population_2122 %>%
select(c(hb_name, starts_with("salbut"), starts_with("average"))) %>%
gt() %>%
cols_label(hb_name = "Healthboard",
salbutamol_dec21 = "December 2021",
salbutamol_jan22 = "January 2022",
salbutamol_feb22 = "February 2022",
average_salbut = "Averages across Dec 2021 to Feb 2022",
average_salbut_per100k = "Averages across Dec 2021 to Feb 2022 per 100k of the population") %>%
cols_align(align = "center", columns = !hb_name) %>%
fmt_number(columns = starts_with("Average"), decimals = 1) %>%
tab_header(title = "Prescriptions of salbutamol and branded equivalents in Winter 2021/22",
subtitle = "Data from Scottish Healthboards") %>%
tab_spanner(label = "Salbutamol prescriptions",
columns = c(salbutamol_dec21: salbutamol_feb22))
| Prescriptions of salbutamol and branded equivalents in Winter 2021/22 | |||||
| Data from Scottish Healthboards | |||||
| Healthboard |
Salbutamol prescriptions
|
Averages across Dec 2021 to Feb 2022 | Averages across Dec 2021 to Feb 2022 per 100k of the population | ||
|---|---|---|---|---|---|
| December 2021 | January 2022 | February 2022 | |||
| NHS Ayrshire and Arran | 53452 | 42913 | 42147 | 46,170.7 | 12,633.9 |
| NHS Borders | 11275 | 9716 | 9412 | 10,134.3 | 8,675.2 |
| NHS Dumfries and Galloway | 29285 | 23135 | 23268 | 25,229.3 | 17,307.6 |
| NHS Forth Valley | 39882 | 32939 | 32051 | 34,957.3 | 11,544.3 |
| NHS Grampian | 50023 | 39070 | 37530 | 42,207.7 | 7,248.4 |
| NHS Highland | 33317 | 24375 | 23281 | 26,991.0 | 8,339.8 |
| NHS Lothian | 102904 | 81019 | 75596 | 86,506.3 | 9,550.3 |
| NHS Orkney | 1233 | 988 | 910 | 1,043.7 | 4,737.5 |
| NHS Shetland | 1629 | 1334 | 1166 | 1,376.3 | 5,978.9 |
| NHS Western Isles | 3984 | 3513 | 2872 | 3,456.3 | 13,232.5 |
| NHS Fife | 29045 | 25488 | 23370 | 25,967.7 | 6,992.0 |
| NHS Tayside | 38689 | 29882 | 28363 | 32,311.3 | 7,799.6 |
| NHS Greater Glasgow and Clyde | 281684 | 233148 | 219377 | 244,736.3 | 20,754.4 |
| NHS Lanarkshire | 86363 | 67401 | 67132 | 73,632.0 | 11,016.5 |
In this code chunk, I calculated the mean salbutamol prescriptions over the 3 months. Additionally, I calculate the number of salbutamol prescriptions per 100k of the population of each healthboard, which allows for comparisons between healthboards to be made.
The table shows the results.
#load data for central heating
heating_data2022 <- read_excel(here("data", "table_2025-11-12_19-17-58.xlsx"))
#tidy dataset and change values in hb_area column to include NHS + healthboard name
heating_data2022 %<>%
rename(c("hb_area2019" = ...2,
"occupied_households" = ...3,
"no_central_heating" = ...4)) %>%
select(c(hb_area2019:no_central_heating)) %>%
mutate(hb_area2019 = paste("NHS", hb_area2019)) %>% #add NHs to name of healthboard
filter(!row_number() %in% c(1:9, 24:28)) #remove, since rows contain no data
#change data for no_central_heating and occupied_households into integers (from characters) and add a column that shows the proportion of households without central heating
heating_data2022 %<>% mutate(no_central_heating = as.numeric(no_central_heating),
occupied_households = as.numeric(occupied_households)) %>%
mutate(percent_no_heating = (no_central_heating/occupied_households)*100) %>% #add column percent_no_heating
select(c(hb_area2019, percent_no_heating))
To find the dataset for census data regarding central heating, go to: https://www.scotlandscensus.gov.uk/webapi/jsf/tableView/tableView.xhtml
I filtered to remove any unnecessary rows in the dataset and converted the data from characters to numeric data. I then calculated the percentage of households with no central heating in each healthboard.
scot_hb <- st_read(here("data", "SG_NHS_HealthBoards_2019", "SG_NHS_HealthBoards_2019.shp")) %>%
mutate(HBName = paste("NHS", HBName)) #add NHS before healthboard name
## Reading layer `SG_NHS_HealthBoards_2019' from data source
## `C:\Users\Admin\OneDrive\Documents\data_science\B253518\data\SG_NHS_HealthBoards_2019\SG_NHS_HealthBoards_2019.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 14 features and 4 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 5512.998 ymin: 530250.8 xmax: 470332 ymax: 1220302
## Projected CRS: OSGB36 / British National Grid
#join heating data with geographical data of healthboard boundaries
heating_map_data <- heating_data2022 %>%
full_join(scot_hb, join_by("hb_area2019" == "HBName"))
heating_map <- heating_map_data %>%
ggplot() +
geom_sf(aes(geometry = geometry, fill = percent_no_heating)) +
scale_fill_viridis_c(name = "% of households with no heating") +
labs(title = "Distribution of households with no central heating",
subtitle = "by Healthboards 2022")
heating_map
Next, I loaded the shapefile data for the Scottish healthboards. I joined the heating data for each healthboard to the shapefile. I used this to create a visualisation of the distribution of households with no central heating. Interestingly, healthboards further north, which have generally colder conditions, have a greater proportion of households with no central heating than those in the south. This may be due to the presence of a higher rural population in the south of Scotland, which may not have access to the technology required for central heating (Scottish Government, 2021).
To find the shapefiles go to: https://www.data.gov.uk/dataset/27d0fe5f-79bb-4116-aec9-a8e565ff756a/nhs-health-boards-scotland
#join average_salbutamol with heating data
final_salbutamol2122 <- salbut_population_2122 %>%
full_join(heating_data2022, by = (c("hb_name" = "hb_area2019"))) %>%
rename("healthboard" = hb_name)
#plot the relationship
interactive_plot <- final_salbutamol2122 %>%
ggplot(aes(x = percent_no_heating,
y = average_salbut_per100k,
colour = healthboard)) +
geom_point() +
theme_minimal() +
labs(title = "Average salbutamol prescriptions against % of households without heating",
subtitle = "in each Scottish healthboard winter months 2021/2022",
x = "no central heating (%)",
y = "Salbutamol prescriptions per 100k people") +
theme(plot.title = element_text(hjust = 0, size = 10),
plot.subtitle = element_text(hjust = 0))
ggplotly(interactive_plot) %>%
layout(title = list(text = paste0("Average salbutamol prescriptions against % of households without heating",
"<br>",
"<sup>",
"in each Scottish healthboard winter months 2021/2022", "</sup>"
)))
Finally, I plotted the relationship between percentage of households with no heating and prescriptions per 100k of the population. The results were interesting, since some of the healthboards with a higher proportion of households with no heating (e.g. NHS Orkney) had fewer prescriptions of salbutamol than other areas that had lower proportions of households with no heating.This is unexpected since we know the cold exacerbates respiratory conditions (Asthma and Lung UK, 2023).
For future research, calculating statistics to test if there is a correlation between these two factors may prove beneficial in determining whether the relationship is likely to be significant. It would also be beneficial to look at other socioeconomic demographics, e.g. SIMD, which may give insight into the unusual relationship seen in the above plot.
If you hover over each point, you will be able to see more detailed descriptive statistics.
I did not use generative AI (ChatGPT etc.) to support me in this assessment.
Reference List
Asthma and Lung UK (2023). Cold Weather and Your Lungs. Available at: https://www.asthmaandlung.org.uk/living-with/cold-weather. [Accessed 1 Dec. 2025]
NHS (2025). About salbutamol inhalers. Available at: https://www.nhs.uk/medicines/salbutamol-inhaler/about-salbutamol-inhalers/ [Accessed 1 Dec. 2025].
Public Health Scotland (2021). Special Health Boards and National Facilities. Available at: https://www.opendata.nhs.scot/dataset/non-standard-geography-codes-and-labels/resource/0450a5a2-f600-4569-a9ae-5d6317141899 [Accessed 1 Dec. 2025].
Public Health Scotland (2024). Geography, population and deprivation support - Geography - NHS board or health board. Available at: https://publichealthscotland.scot/resources-and-tools/health-intelligence-and-data-management/geography-population-and-deprivation-support/geography/nhs-board-or-health-board/ [Accessed 1 Dec. 2025].
Scottish Government (2021). Rural Scotland Key Facts 2021. Available at: https://www.gov.scot/publications/rural-scotland-key-facts-2021/ [Accessed 1 Dec. 2025].